Synergy and anti-synergy

Sometimes, two predictor variables exert a synergistic effect on an outcome. That is, when both predictors increase by one unit at the same time, the outcome changes by more than the sum of the two separate changes you'd expect from increasing each predictor by one unit on its own. You can test for synergy between two predictors with respect to an outcome by fitting a model that contains an interaction term, which is the product of those two variables. In the following model, we predict SBP from Age and Weight, and include an interaction term for Age and Weight:

SBP = Age + Weight + Age * Weight

If the estimated coefficient for the interaction term has a statistically significant p value, then the null hypothesis of no interaction is rejected, and the two variables are interpreted as having a significant interaction. If the interaction coefficient is positive, the interaction is synergistic, and if it is negative, it is called an anti-synergistic or antagonistic interaction.
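
As a concrete illustration, here is a minimal sketch in Python using the statsmodels formula interface. The data are simulated for this example (they are not the book's Table 17-2 values), and the coefficient sizes are made up purely so the code runs end to end:

```python
# Fit a multiple regression of SBP on Age, Weight, and their interaction,
# then inspect the interaction term's coefficient and p value.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 100
age = rng.uniform(30, 80, n)                      # simulated ages (years)
weight = rng.uniform(55, 110, n)                  # simulated weights (kg)
# Simulated SBP with a small positive Age-by-Weight interaction built in
sbp = 60 + 0.4 * age + 0.3 * weight + 0.004 * age * weight + rng.normal(0, 8, n)
df = pd.DataFrame({"SBP": sbp, "Age": age, "Weight": weight})

# "Age * Weight" in a model formula expands to Age + Weight + Age:Weight,
# where Age:Weight is the product (interaction) term.
model = smf.ols("SBP ~ Age * Weight", data=df).fit()
print(model.summary().tables[1])

b_int = model.params["Age:Weight"]
p_int = model.pvalues["Age:Weight"]
print(f"Interaction coefficient = {b_int:.4f}, p = {p_int:.4f}")
# A significant positive interaction coefficient suggests synergy; a
# significant negative one suggests anti-synergy (antagonism).
```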

Introducing interaction terms into a model, and interpreting their significance both clinically and statistically, must be done in context. Interaction terms may not be appropriate for certain models and may be required in others.

Collinearity and the mystery of the disappearing significance

When developing multiple regression models, you usually consider more than the two predictors used in our example. You develop iterative models, meaning models with the same outcome variable but different groups of predictors, and you follow some strategy for the order in which you introduce the predictors into those models, as described in Chapter 20. So imagine that you used our example data set and, in one iteration, ran a model predicting SBP that included Age among the predictors, and the coefficient for Age was statistically significant. Now imagine you added Weight to that model, and in the new model Age was no longer statistically significant! You've just been visited by the collinearity fairy.
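
Here is a hedged sketch of that scenario in Python, using simulated data in which Age and Weight are deliberately made collinear. The numbers are invented for illustration and are not taken from the book's data set:

```python
# Two iterative models: Age alone, then Age plus a collinear Weight variable.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 20                                             # small sample, as in many textbook examples
age = rng.uniform(30, 80, n)
weight = 0.9 * age + rng.normal(0, 5, n)           # Weight tracks Age closely (collinear)
sbp = 80 + 0.6 * age + 0.4 * weight + rng.normal(0, 10, n)
df = pd.DataFrame({"SBP": sbp, "Age": age, "Weight": weight})

m1 = smf.ols("SBP ~ Age", data=df).fit()           # first iteration: Age only
m2 = smf.ols("SBP ~ Age + Weight", data=df).fit()  # next iteration: add Weight
print("Age p value, Age-only model:  ", round(m1.pvalues["Age"], 4))
print("Age p value, Age+Weight model:", round(m2.pvalues["Age"], 4))
# With strongly collinear predictors, Age is typically significant on its own
# but can lose significance once Weight enters the model.
```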

In the example from Table 17-2, there's a statistically significant positive correlation between each predictor and the outcome. We figured this out when running the correlations for Figure 17-1, but you could check our work by using the data from Table 17-2 in a straight-line regression, as described in Chapter 16. In contrast, the multiple regression output in Figure 17-2 shows that neither Age nor Weight is statistically significant in the model, meaning neither has a regression coefficient that is statistically significantly different from zero! Why are they associated with the outcome in the correlation analysis but not in the multiple regression?

The answer is collinearity. In the regression world, the term collinearity (also called multicollinearity) refers to a strong correlation between two or more of the predictor variables. If you run a correlation between Age and Weight (the two predictors), you'll find that they're statistically significantly correlated with each other. It is this situation that destroys the statistically significant p values you see on some predictors in iterative models when doing multiple regression.
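
If you want to check this yourself, a quick sketch (again with simulated rather than the book's data) is to run a Pearson correlation between the two predictors:

```python
# Correlation between the two predictors as a simple collinearity check.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
age = rng.uniform(30, 80, 20)                     # simulated ages
weight = 0.9 * age + rng.normal(0, 5, 20)         # Weight tracks Age closely
r, p = pearsonr(age, weight)
print(f"Correlation between Age and Weight: r = {r:.2f}, p = {p:.4g}")
# A strong, statistically significant correlation between two predictors is
# exactly the collinearity that can wash out their individual p values in a
# multiple regression containing both.
```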

The problem with collinearity is that you cannot tell which of the two predictor variables is actually